(Almost) No Label No Cry
Abstract
In Learning with Label Proportions (LLP), the objective is to learn a supervised classifier when, instead of labels, only label proportions for bags of observations are known. This setting has broad practical relevance, in particular for privacy-preserving data processing. We first show that the mean operator, a statistic which aggregates all labels, is minimally sufficient for the minimization of many proper scoring losses with linear (or kernelized) classifiers without using labels. We provide a fast learning algorithm that estimates the mean operator via a manifold regularizer with guaranteed approximation bounds. We then present an iterative learning algorithm that uses this estimate as its initialization. We ground this algorithm in Rademacher-style generalization bounds that fit the LLP setting, introducing a generalization of Rademacher complexity and a Label Proportion Complexity measure. This latter algorithm optimizes tractable bounds for the corresponding bag-empirical risk. Experiments are provided on fourteen domains, whose size ranges up to ≈300K observations. They show that our algorithms are scalable and tend to consistently outperform the state of the art in LLP. Moreover, in many cases, our algorithms compete with, or are within a few percent of AUC of, the Oracle that learns knowing all labels. On the largest domains, half a dozen proportions can suffice, i.e., roughly 40K times fewer than the total number of labels.
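The role of the mean operator described in the abstract can be illustrated with a minimal sketch. It assumes labels in {-1, +1}, a linear classifier theta, and the logistic loss as one example of the losses the abstract refers to. All function names are hypothetical. The estimator shown is not the manifold-regularized one from the abstract: it is a simpler stand-in that assumes the class-conditional feature means are the same in every bag and recovers them by least squares from bag means and label proportions.

```python
import numpy as np

def mean_operator(X, y):
    """Mean operator mu = (1/m) * sum_i y_i * x_i, with y_i in {-1, +1}."""
    return (y[:, None] * X).mean(axis=0)

def logistic_risk(theta, X, y):
    """Ordinary empirical logistic risk, which needs every label."""
    return np.mean(np.logaddexp(0.0, -y * (X @ theta)))

def logistic_risk_from_mu(theta, X, mu):
    """Same risk, written with a label-free part plus a term in mu only.

    Uses log(1 + exp(-y*z)) = 0.5*[log(1+exp(-z)) + log(1+exp(z))] - 0.5*y*z.
    """
    z = X @ theta
    label_free = 0.5 * (np.logaddexp(0.0, -z) + np.logaddexp(0.0, z))
    return np.mean(label_free) - 0.5 * (theta @ mu)

def mean_map_estimate(X, bag_ids, proportions):
    """Estimate mu from bag proportions only (simplified stand-in).

    Assumption (not from the abstract): every bag shares the same
    class-conditional means b_plus and b_minus, so each bag mean equals
    pi_j * b_plus + (1 - pi_j) * b_minus. Needs at least two bags with
    different proportions.
    """
    n_bags = len(proportions)
    sizes = np.bincount(bag_ids, minlength=n_bags)
    bag_means = np.stack([X[bag_ids == j].mean(axis=0) for j in range(n_bags)])
    A = np.column_stack([proportions, 1.0 - proportions])
    B, *_ = np.linalg.lstsq(A, bag_means, rcond=None)  # rows: b_plus, b_minus
    p = np.dot(sizes, proportions) / sizes.sum()       # overall positive rate
    return p * B[0] - (1.0 - p) * B[1]

# Illustrative usage on synthetic data (values are arbitrary).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 3))
y_demo = rng.choice([-1.0, 1.0], size=200)
theta_demo = rng.normal(size=3)
mu_demo = mean_operator(X_demo, y_demo)
risk_with_labels = logistic_risk(theta_demo, X_demo, y_demo)
risk_with_mu = logistic_risk_from_mu(theta_demo, X_demo, mu_demo)
```

In this sketch the two risk values agree up to floating-point error, which is the sense in which the labels enter the loss only through the mean operator. When the equal-means assumption fails, the least-squares estimate is only approximate.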